Skip to content

profiles: use single Profile.sample_type #649

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

florianl
Copy link
Contributor

@florianl florianl commented May 2, 2025

Implements #633 (comment).

FYI: @open-telemetry/profiling-approvers

Implements open-telemetry#633 (comment).

Signed-off-by: Florian Lehner <florian.lehner@elastic.co>
Co-authored-by: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com>
@florianl florianl requested review from a team as code owners May 2, 2025 12:47
@aalexand
Copy link
Member

aalexand commented May 2, 2025

Should default sample type go to the scope message?

Co-authored-by: Christos Kalkanis <christos.kalkanis@elastic.co>
@christos68k
Copy link
Member

christos68k commented May 5, 2025

Should default sample type go to the scope message?

That would be less flexible as it'd force us to have separate instrumentation scopes (ScopeProfiles) for different sample types. With the current proposal that's a possibility but not a requirement. The OTel documentation for instrumentation scope is sufficiently vague to allow both possibilities. For example, with the OTel eBPF profiling agent, we could have a single instrumentation scope (the eBPF profiler receiver/library) produce profiles with different sample types (cpu and memory).

Do you have a use-case in mind where associating different sample types with separate instrumentation scopes would be a benefit?

(I don't have a strong opinion on this, just that if we don't have a good reason to force a separate instrumentation scope per sample type, I'd go for more flexibility).

@aalexand
Copy link
Member

aalexand commented May 6, 2025

Should default sample type go to the scope message?

That would be less flexible as it'd force us to have separate instrumentation scopes (ScopeProfiles) for different sample types. With the current proposal that's a possibility but not a requirement. The OTel documentation for instrumentation scope is sufficiently vague to allow both possibilities. For example, with the OTel eBPF profiling agent, we could have a single instrumentation scope (the eBPF profiler receiver/library) produce profiles with different sample types (cpu and memory).

Do you have a use-case in mind where associating different sample types with separate instrumentation scopes would be a benefit?

(I don't have a strong opinion on this, just that if we don't have a good reason to force a separate instrumentation scope per sample type, I'd go for more flexibility).

I mostly want to make sure there is some way to capture the default sample type. The current PR removes the field IIUC.

@florianl
Copy link
Contributor Author

florianl commented May 6, 2025

Should default sample type go to the scope message?
I mostly want to make sure there is some way to capture the default sample type. The current PR removes the field IIUC.

The main purpose of the proposal is to align values with timestamps as laid out in #633 (comment).

In general google/pprof it is possible to have multiple sample_types per Profile with special handling of sample_type.unit=count but does not have detailed timestamps but just a durations. With a slightly more focus on timestamps in OTel Profiling, I think, the concept of having multiple sample_types, like in google/pprof, conflicts. IIRC @felixge and I discussed the situation, where one wants to report profiles with different sample types and we concluded that it should be fine, to have multiple message Profile with their dedicated sample_type. I think, it also reduces complexity by not having a concept of a default sample_type and simplifies also the handling of message Profiles in the backend, if they just want to focus on a certain sample_type.

@christos68k
Copy link
Member

Should default sample type go to the scope message?
I mostly want to make sure there is some way to capture the default sample type. The current PR removes the field IIUC.

The main purpose of the proposal is to align values with timestamps as laid out in #633 (comment).

In general google/pprof it is possible to have multiple sample_types per Profile with special handling of sample_type.unit=count but does not have detailed timestamps but just a durations. With a slightly more focus on timestamps in OTel Profiling, I think, the concept of having multiple sample_types, like in google/pprof, conflicts. IIRC @felixge and I discussed the situation, where one wants to report profiles with different sample types and we concluded that it should be fine, to have multiple message Profile with their dedicated sample_type. I think, it also reduces complexity by not having a concept of a default sample_type and simplifies also the handling of message Profiles in the backend, if they just want to focus on a certain sample_type.

@aalexand (I misread your previous comment and thought you wanted to lift sample_type to the scope message). Regarding default_sample_type, are you referring to introducing it in the scope message in order to make sample_type in each profile optional or do you have something else in mind? Lile @florianl mentioned the current proposal splits different sample_types across different profiles.

@aalexand
Copy link
Member

aalexand commented May 6, 2025

Should default sample type go to the scope message?
I mostly want to make sure there is some way to capture the default sample type. The current PR removes the field IIUC.

@aalexand (I misread your previous comment and thought you wanted to lift sample_type to the scope message). Regarding default_sample_type, are you referring to introducing it in the scope message in order to make sample_type in each profile optional or do you have something else in mind? Lile @florianl mentioned the current proposal splits different sample_types across different profiles.

As one example, Go heap profiles have four sample types today: alloc_objects, alloc_space, inuse_objects, inuse_space. There are two profile types with this sample type set:

  • The heap profile sets the default sample type to empty string (that is, unsets it) which means that tools visualizing the profile should use the last sample type as the default one (the "use the last one if the default is not set" is a non-intuitive convention but this is what the pprof format has historically).
  • The alloc profile on the other hand sets the default sample type to alloc_space, so that tools like pprof focus the user on that metric by default.

The question is how such a profile with four sample types and one of them being the preferred default as chosen by the profile producer should be encoded in OpenTelemetry profiles. Before this change it would be the array of sample types plus the default sample type. What would it be after this change and how would the default be captured?

@christos68k
Copy link
Member

The question is how such a profile with four sample types and one of them being the preferred default as chosen by the profile producer should be encoded in OpenTelemetry profiles. Before this change it would be the array of sample types plus the default sample type. What would it be after this change and how would the default be captured?

The current proposal doesn't have an explicit default sample type (and we probably should steer away from introducing implicit conventions) so assuming four different sample types in a single OTel messsage, samples will be grouped according to sample type to separate profiles. One possible encoding would have four different profiles, each with a different sample type (of course one could further split the data according to other criteria).

Some options to accommodate the notion of a default sample type (e.g. for a consumer visualization tool) include:

  1. Use an InstrumentationScope attribute
  2. Introduce default_sample_type in ScopeProfiles

1 is more restricted than 2 as it could be seen as purely advisory metadata that a visualization tool can use. Adding default_sample_type to ScopeProfiles could also act as a fallback sample_type for profiles that do not declare it.

I think using nanoseconds is the better default unit for CPU profiles
because the weight of a count can be different depending on the sample
rate.

Using the same unit for off_cpu and cpu is more consistent.

The comment above was a little confusing, since it referred only to a
cpu profile, but was then followed by an off_cpu profile sample as well.
Copy link
Member

@felixge felixge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM in general. Sorry for the delay. PTAL at my comments. (I still need to find time to reply to some of the comments above, will do soon)

florianl and others added 4 commits May 27, 2025 14:11
profiles: improve sample_type comment
Signed-off-by: Florian Lehner <florian.lehner@elastic.co>
@florianl
Copy link
Contributor Author

florianl commented Jun 3, 2025

I have added default_sample_type as suggested by @christos68k and discussed in the OTel Profiling SIG (2025-06-29):

  1. Introduce default_sample_type in ScopeProfiles

I didn't mark Profile.sample_type as optional, as this would conflict with the effort to avoid optional in the proto (see #659).

…e_type

Signed-off-by: Florian Lehner <florian.lehner@elastic.co>

// The preferred type and unit of Samples in at least one Profile.
// See Profile.sample_type for possible values.
ValueType default_sample_type = 4;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it need to be a dedicated field? Could it be a scope attribute?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could use an attribute (it was the other option here) but then we'd need to ensure key uniqueness and possibly introduce a new semantic convention.

Given that all fields are optional in proto3, and that this specific attribute seems only relevant to profiling, it's simpler I think to have this field here? We could also update the field documentation to mention that it's optional.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Profiling SIG discussed this topic and agreed to prefer a dedicated field over some other mechanism.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants